Real Time Information Extraction from Microblog
Authors
Abstract
This paper presents the participation of the Information Retrieval Lab (IR Lab, DA-IICT Gandhinagar) in the FIRE 2016 Microblog Track. The main objective of the track is to identify Information Retrieval methodologies for retrieving important information from tweets posted on Twitter during disasters. We submitted two runs for this track. In the first run, daiict irlab 1, we expanded the topic terms using a Word2vec model trained on the tweet corpus provided by the organizers. Relevance scores for the tweets are calculated with the Okapi BM25 model. Precision@20, the primary metric, is 0.3143 for this run. In the second run, daiict irlab 2, we set different weights for the original and the expanded topic terms and achieved a Precision@20 of around 0.30.
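The two runs described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `neighbours` dictionary is a hypothetical stand-in for the nearest-neighbour terms a trained Word2vec model would return, the toy tweets and the topic are invented, and the BM25 parameters (`k1`, `b`), the smoothed IDF form, and the `expansion_weight` value are assumptions, since the abstract does not report them.

```python
import math
from collections import Counter


def expand_query(terms, neighbours, expansion_weight=0.5):
    """Build a weighted query: original terms get weight 1.0,
    expansion terms get expansion_weight (an assumed value)."""
    weights = {t: 1.0 for t in terms}
    for t in terms:
        for n in neighbours.get(t, []):
            # Never lower the weight of a term that is already in the query.
            weights.setdefault(n, expansion_weight)
    return weights


def bm25_scores(query_weights, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a weighted bag of query terms
    using Okapi BM25 with a smoothed, always-positive IDF."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term, weight in query_weights.items():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = 1 - b + b * len(d) / avg_len
            tf_part = tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
            score += weight * idf * tf_part
        scores.append(score)
    return scores


# Toy data, invented for illustration only.
tweets = [
    ["medical", "camp", "needs", "doctors"],
    ["relief", "camp", "has", "medicines", "available"],
    ["traffic", "jam", "in", "city"],
]
topic = ["medical"]
# Hypothetical output of a Word2vec similarity lookup for each topic term.
neighbours = {"medical": ["medicines", "doctors"]}

plain = bm25_scores(expand_query(topic, {}), tweets)
expanded = bm25_scores(expand_query(topic, neighbours), tweets)
```

With equal weights for original and expanded terms (`expansion_weight=1.0`) the sketch mirrors the idea of the first run; a lower weight for expanded terms mirrors the idea of the second. In the toy data, the second tweet is only retrieved once the topic has been expanded.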
Similar resources
Microblog Track 2011 of FDU
Twitter provides a huge amount of short messages, which raises challenging problems for the research community. The Microblog Track of TREC detects the special behavior of the Twitter dataset in the “real-time” retrieval task. This paper reports our participation in the Microblog Track task. Given the query topics, each participant is required to conduct a “real-time” retrieval task, which seeks for the...
Understanding Status Update in Microblog: A Perspective on Media Needs
Microblog has grown popularly as a seminal social medium for timely information seeking and sharing. However, the reason why individuals update real-time information in microblog has not been well understood, and empirical research to address this specific information behavior is hardly available. As a felt urge can be conceptualized as a precursor of real-time updating in the microblog, we att...
ITEPE: A Source Tracing Algorithm for the Microblog
Finding the true source of a social network is a crucial component of social network information tracing. Using the new media microblog as an example, this paper provides a source tracing algorithm ITEPE (Initiators and Early Participants Extraction) to solve this problem. First, the cascade (session tree) is built according to the retweeting of a microblog, after which the cascade set (session...
Diamonds in the Rough: Event Extraction from Imperfect Microblog Data
We introduce a distantly supervised event extraction approach that extracts complex event templates from microblogs. We show that this near real-time data source is more challenging than news because it contains information that is both approximate (e.g., with values that are close but different from the gold truth) and ambiguous (due to the brevity of the texts), impacting both the evaluation ...
A Real-time Search Structure and Classification Algorithm of Microblog Based on Partial Indexing
With the popularity and the rapid development of social networks, microblog has also obtained great attention and gotten wide application. Microblog may produce a large amount of information per second. The real-time search is a critical technology to timely get the latest high-quality information from microblog. A new real-time search framework based on meta-search engine and a query classific...